Skip to content

Conversation

@pankaj-bind
Copy link
Contributor

@pankaj-bind pankaj-bind commented Mar 28, 2025

Issue #3 fixed
This PR optimizes the allSubsets method in CTOrderedSet by replacing the inefficient combinations-based implementation with a bitmask-based approach. Additionally, it introduces a new performance test, testAllSubsetsPerformance, in CTOrderedSetTest to ensure the method generates the correct number of subsets (1022 for a set of size 10) and completes within 1 second.

CTOrderedSet.class.st

Before

{ #category : #'as yet unclassified' }
CTOrderedSet >> allSubsets [
	"Generate all possible subsets of this self without self and an empty set.
	Example:
	{1, 2, 3} => {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}"
	
	"This implementation might not be very fast and has to be improved"
	^ (self combinations copyWithout: self asArray) collect: [ :eachArray |
		self species withAll: eachArray ]
]

After

{ #category : #'as yet unclassified' }
CTOrderedSet >> allSubsets [
	"Generate all possible subsets of the receiver, excluding the receiver itself and the empty set.
	Uses a bitmask approach for efficiency.
	Example: #(a b c) => #(a), #(b), #(c), #(a b), #(a c), #(b c)"
	
	| subsets size |
	size := self size.
	subsets := OrderedCollection new: (1 bitShift: size) - 2. "Reserve space for 2^n - 2 subsets"
	
	1 to: (1 bitShift: size) - 2 do: [ :mask |
		| subset |
		subset := self species new.
		1 to: size do: [ :index |
			(mask bitAt: index) = 1 ifTrue: [ subset addLast: (self at: index) ] ].
		subsets add: subset ].
	
	^ subsets asArray
]

CTOrderedSetTest.class.st

Added

{ #category : #'performance tests' }
CTOrderedSetTest >> testAllSubsetsPerformance [
	| set time |
	set := CTOrderedSet withAll: (1 to: 10) asArray. "Set of size 10"
	
	time := [ set allSubsets ] timeToRun.
	self assert: set allSubsets size equals: (1 bitShift: 10) - 2. "1022 subsets"
	self assert: time < 1 second description: 'Should complete within 1 second for size 10'.
]

@jordanmontt

@pankaj-bind
Copy link
Contributor Author

@jordanmontt please look for this one


time := [ set allSubsets ] timeToRun.
self assert: set allSubsets size equals: (1 bitShift: 10) - 2. "1022 subsets"
self assert: time < 1 second description: 'Should complete within 1 second for size 10'.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this fail used to fail before?

@jordanmontt
Copy link
Member

Interesting optimization. But, to be sure that it is faster we need to don some benchmarks. In Pharo you can use the timeToRun method. Which for this case is enough. The receiver of timeToRun needs to be a block. Could you run some quick benchmarks please

@pankaj-bind
Copy link
Contributor Author

pankaj-bind commented Apr 19, 2025

  1. The testAllSubsetsPerformance is a new test, so it didn’t exist before. I tested the old allSubsets implementation (combinations-based), and it passed. The new implementation also passes.

  2. Benchmarks: I compared old and new allSubsets for set sizes 8, 10, and 12:

Set Size Old Time (ms) New Time (ms) Speedup (%)
8 0.33 0.23 140.77
10 1.59 1.17 136.11
12 7.22 5.03 143.51

The speedup reflects the percentage improvement. Let me know if further adjustments are needed!

image

@jordanmontt
Copy link
Member

Great! Sorry for taking time to merge this

@jordanmontt jordanmontt merged commit 21057ee into pharo-containers:master May 2, 2025
0 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants